CLAIMS

1. A method for providing at least one self-tuning object 
to a user program, comprising: 

receiving said user program; 

simulating execution of said user program; 

detecting, during said simulating of said execution of 
said user program, a plurality of expressions including said 
self-tuning object in said user program;

generating, in response to said detecting said 
plurality of expressions including said self-tuning object
in said user program, a trace file indicating a sequence of 
said expressions including said self-tuning object in said 
user program; 

dividing said trace file into a plurality of trace file 
blocks; 

converting said trace file blocks into source code 
expression blocks; 

generating a plurality of minimal timing, compiled 
expression blocks, each of said plurality of minimal timing, 
compiled expression blocks corresponding to a respective one 
of said source code expression blocks, said generating said 
plurality of minimal timing, compiled expression blocks 
including application of at least one compiler optimization 
technique; and 

linking said plurality of minimal timing, compiled 
expression blocks into said user program. 

2. The method of claim 1, wherein said detecting said 
plurality of expressions including said self-tuning object 
in said user program is performed by program code associated
with at least one overloaded operator associated with said
self-tuning object.

3. The method of claim 1, wherein said generating a trace 
file indicating the sequence of said expressions including 
said self-tuning object in said user program is performed by 
program code associated with at least one overloaded 
operator associated with said self-tuning object. 
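
As an illustration of claims 2 and 3 only, the following minimal sketch shows how overloaded operators could both detect expressions involving a tracked object and append them to a trace. The class name `Traced`, the temporary names `t0`, `t1`, and the in-memory list standing in for a trace file are assumptions made for this example; they do not come from the claims.

```python
class Traced:
    """Hypothetical stand-in for a self-tuning object."""

    trace = []  # shared record of expressions, in execution order

    def __init__(self, name):
        self.name = name

    def _record(self, op, other):
        # Each overloaded operator lands here: the expression is detected
        # and one trace entry is appended for it.
        other_name = other.name if isinstance(other, Traced) else repr(other)
        result = Traced(f"t{len(Traced.trace)}")
        Traced.trace.append(f"{result.name} = {self.name} {op} {other_name}")
        return result

    def __add__(self, other):
        return self._record("+", other)

    def __mul__(self, other):
        return self._record("*", other)


a, b, c = Traced("a"), Traced("b"), Traced("c")
d = a * b + c  # recorded as two expressions, in evaluation order
```

After this runs, `Traced.trace` holds one entry per operator application, which is the kind of sequence the later dividing and converting steps would consume.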

4. The method of claim 1, wherein said dividing said trace 
file into said plurality of trace file blocks is performed 
such that a total amount of computational dependencies and 
synchronization requirements within said user program, 
including computational dependencies and synchronization 
requirements between trace file blocks, is minimized.
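
One way to read the quantity in claim 4 is as a count of values produced in one block and consumed in another. The sketch below scores a candidate division under that assumption; the tuple-based trace representation and the function name `cross_block_dependencies` are hypothetical, and the claims do not specify how the minimization is actually carried out.

```python
def cross_block_dependencies(trace, cuts):
    """Count uses of a value in a block other than the one that produced it.

    trace: list of (dest, sources) tuples in execution order.
    cuts:  sorted indices at which a new block starts.
    """
    def block_of(index):
        return sum(1 for cut in cuts if cut <= index)

    produced_in = {}
    count = 0
    for i, (dest, sources) in enumerate(trace):
        for src in sources:
            # Inputs never produced inside the trace are not counted.
            if src in produced_in and produced_in[src] != block_of(i):
                count += 1
        produced_in[dest] = block_of(i)
    return count


trace = [
    ("t0", ["a", "b"]),
    ("t1", ["t0", "c"]),
    ("t2", ["d", "e"]),
    ("t3", ["t2", "t1"]),
]
```

A division that starts the second block at index 2 leaves one cross-block use, while starting it at index 3 leaves two, so a search over candidate divisions would prefer the former under this scoring.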

5. The method of claim 1, wherein said dividing said trace 
file into said plurality of trace file blocks is performed 
responsive to user provided delimiters included within said 
user program. 
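
The delimiter-driven division of claim 5 can be sketched as a simple split of the recorded trace. The marker string `#BLOCK` and the function name `split_trace` are assumptions for this example; the claims do not say what form the user provided delimiters take.

```python
def split_trace(lines, delimiter="#BLOCK"):
    """Split trace lines into blocks at each delimiter line."""
    blocks, current = [], []
    for line in lines:
        if line.strip() == delimiter:
            # A delimiter closes the current block; empty blocks are skipped.
            if current:
                blocks.append(current)
            current = []
        else:
            current.append(line)
    if current:
        blocks.append(current)
    return blocks


blocks = split_trace(["t0 = a * b", "#BLOCK", "t1 = t0 + c", "t2 = t1 * d"])
```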

6. The method of claim 1, wherein said generating said 
plurality of minimal timing, compiled expression blocks 
further comprises compiling and executing at least one of 
said expression blocks multiple times while varying a value 
of at least one optimization parameter for said at least one 
compiler optimization technique. 

7. The method of claim 6, wherein said generating said 
plurality of minimal timing, compiled expression blocks
further comprises timing said multiple executions of said
compiled expression blocks. 
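
The compile, execute, and time loop of claims 6 and 7 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: loop unrolling stands in for the compiler optimization technique, the unroll factor stands in for the optimization parameter, and Python's built-in `compile` stands in for a native compiler. The names `make_variant` and `pick_fastest` are hypothetical.

```python
import timeit


def make_variant(unroll):
    """Generate and compile a summation loop unrolled `unroll` times."""
    body = "\n".join(f"        total += data[i + {k}]" for k in range(unroll))
    src = (
        "def block(data):\n"
        "    total = 0\n"
        f"    for i in range(0, len(data), {unroll}):\n"
        f"{body}\n"
        "    return total\n"
    )
    namespace = {}
    exec(compile(src, f"<unroll={unroll}>", "exec"), namespace)
    return namespace["block"]


def pick_fastest(candidates, data, repeats=5):
    """Time each variant several times and keep the fastest parameter value."""
    timings = {}
    for unroll in candidates:
        fn = make_variant(unroll)
        timings[unroll] = min(
            timeit.repeat(lambda: fn(data), number=20, repeat=repeats)
        )
    best = min(timings, key=timings.get)
    return best, timings


data = list(range(1024))  # length divisible by every candidate unroll factor
best, timings = pick_fastest([1, 2, 4, 8], data)
```

Which factor wins depends on the machine and interpreter, which is why the choice is made by measurement rather than fixed in advance.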

8. The method of claim 1, wherein said linking of said 
minimal timing, compiled expression blocks to said user
program is responsive to execution of said user program.

9. The method of claim 8, wherein said linking of said 
minimal timing, compiled expression blocks further comprises
detecting, during said execution of said user program, said
plurality of expressions including said self-tuning object 
in said user program. 

10. The method of claim 9, wherein said linking of said 
minimal timing, compiled expression blocks further comprises
scheduling said minimal timing, compiled expression blocks
for execution on at least one processor of a target parallel 
processing computer. 
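
For claim 10, one simple scheduling policy is a greedy longest-first assignment of blocks to the least-loaded processor. The sketch below assumes the blocks are independent and that a measured cost per block is available; the claims name neither a policy nor a cost model, and the function name `schedule` is hypothetical.

```python
import heapq


def schedule(block_times, n_processors):
    """Greedy longest-first assignment of independent blocks to processors."""
    heap = [(0.0, p) for p in range(n_processors)]  # (current load, processor)
    assignment = {p: [] for p in range(n_processors)}
    for name, cost in sorted(block_times.items(), key=lambda kv: -kv[1]):
        load, p = heapq.heappop(heap)  # least-loaded processor so far
        assignment[p].append(name)
        heapq.heappush(heap, (load + cost, p))
    return assignment


plan = schedule({"b0": 4.0, "b1": 3.0, "b2": 2.0, "b3": 1.0}, 2)
```

With these example costs both processors end up with a total load of 5.0. Blocks with dependencies between them would need a dependency-aware scheduler instead.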

11. A computer program product including a computer
readable medium, said computer readable medium having at
least one computer program stored thereon, said at least one 
computer program comprising: 

program code for receiving said user program; 

program code for simulating execution of said user
program;

program code for detecting, during said simulating of 
said execution of said user program, a plurality of 
expressions including said self-tuning object in said user 
program;

program code for generating, in response to said
detecting said plurality of expressions including said self-
tuning object in said user program, a trace file indicating
a sequence of said expressions including said self-tuning
object in said user program;

program code for dividing said trace file into a 
plurality of trace file blocks; 

program code for converting said trace file blocks into 
source code expression blocks;

program code for generating a plurality of minimal
timing, compiled expression blocks, each of said plurality
of minimal timing, compiled expression blocks corresponding 
to a respective one of said source code expression blocks, 
said generating said plurality of minimal timing, compiled 
expression blocks including application of at least one
compiler optimization technique; and 

program code for linking said plurality of minimal 
timing, compiled expression blocks into said user program. 

12. The computer program product of claim 11, wherein said
program code for detecting said plurality of expressions
including said self-tuning object in said user program
comprises program code associated with at least one
overloaded operator associated with said self-tuning object.



13. The computer program product of claim 11, wherein said 
program code for generating a trace file indicating the 
sequence of said expressions including said self-tuning 
object in said user program comprises program code 
associated with at least one overloaded operator associated
with said self-tuning object.

14. The computer program product of claim 11, wherein said
program code for dividing said trace file into said
plurality of trace file blocks is operative to divide said
trace file into said plurality of trace file blocks such
that a total amount of computational dependencies and 
synchronization requirements within said user program, 
including computational dependencies and synchronization 
requirements between trace file blocks, is minimized.



15. The computer program product of claim 11, wherein said 
program code for dividing said trace file into said 
plurality of trace file blocks is operative to divide said 
trace file into said plurality of trace file blocks
responsive to user provided delimiters included within said
user program. 

16. The computer program product of claim 11, wherein said 
program code for generating said plurality of minimal
timing, compiled expression blocks further comprises program
code for compiling and executing at least one of said 
expression blocks multiple times while varying a value of at 
least one optimization parameter for said at least one 
compiler optimization technique. 



17. The computer program product of claim 16, wherein said 
program code for generating said plurality of minimal 
timing, compiled expression blocks further comprises program 
code for timing said multiple executions of said compiled 
expression blocks.

18. The computer program product of claim 11, wherein said
program code for linking of said minimal timing, compiled 
expression blocks to said user program is triggered by 
execution of said user program. 

19. The computer program product of claim 18, wherein said 
linking of said minimal timing, compiled expression blocks 
further comprises program code for detecting, during said 
execution of said user program, said plurality of
expressions including said self-tuning object in said user
program. 

20. The computer program product of claim 19, wherein said 
program code for linking of said minimal timing, compiled
expression blocks further comprises program code for
scheduling said minimal timing, compiled expression blocks 
for execution on at least one processor of a target parallel 
processing computer. 

21. The computer program product of claim 11, wherein said
computer program comprises a compiler. 

22. A computer data signal embodied in a carrier wave, said 
computer data signal including at least one computer 
program, said at least one computer program comprising:

program code for receiving said user program; 

program code for simulating execution of said user 
program; 

program code for detecting, during said simulating of 
said execution of said user program, a plurality of
expressions including said self-tuning object in said user
program; 

program code for generating, in response to said 
detecting said plurality of expressions including said self-
tuning object in said user program, a trace file indicating
a sequence of said expressions including said self-tuning 
object in said user program; 

program code for dividing said trace file into a 
plurality of trace file blocks;

program code for converting said trace file blocks into
source code expression blocks;

program code for generating a plurality of minimal 
timing, compiled expression blocks, each of said plurality 
of minimal timing, compiled expression blocks corresponding 
to a respective one of said source code expression blocks,
said generating said plurality of minimal timing, compiled 
expression blocks including application of at least one 
compiler optimization technique; and 

program code for linking said plurality of minimal 
timing, compiled expression blocks into said user program.

23. A system for providing at least one self-tuning object 
to a user program, comprising:

at least one processor;

at least one memory communicably coupled to said at
least one processor;

a computer program for execution on said processor, 
said computer program stored in said memory, said computer 
program comprising:

program code for receiving said user program;




program code for simulating execution of said user 
program; 

program code for detecting, during said simulating 
of said execution of said user program, a plurality of 
expressions including said self-tuning object in said
user program;

program code for generating, in response to said 
detecting said plurality of expressions including said 
self-tuning object in said user program, a trace file 
indicating a sequence of said expressions including
said self-tuning object in said user program;

program code for dividing said trace file into a 
plurality of trace file blocks; 

program code for converting said trace file blocks 
into source code expression blocks;

program code for generating a plurality of minimal 
timing, compiled expression blocks, each of said 
plurality of minimal timing, compiled expression blocks 
corresponding to a respective one of said source code 
expression blocks, said generating said plurality of
minimal timing, compiled expression blocks including
application of at least one compiler optimization 
technique; and 

program code for linking said plurality of minimal 
timing, compiled expression blocks into said user
program.

24. A system for providing at least one self-tuning object 
to a user program, comprising:

means for receiving said user program;

means for simulating execution of said user program;

means for detecting, during said simulating of said
execution of said user program, a plurality of expressions 
including said self-tuning object in said user program; 

means for generating, in response to said detecting 
said plurality of expressions including said self-tuning
object in said user program, a trace file indicating a 
sequence of said expressions including said self-tuning 
object in said user program; 

means for dividing said trace file into a plurality of 
trace file blocks;

means for converting said trace file blocks into source 
code expression blocks; 

means for generating a plurality of minimal timing, 
compiled expression blocks, each of said plurality of 
minimal timing, compiled expression blocks corresponding to
a respective one of said source code expression blocks, said 
generating said plurality of minimal timing, compiled 
expression blocks including application of at least one 
compiler optimization technique; and

means for linking said plurality of minimal timing,
compiled expression blocks into said user program.




